feat(cloud): minimal Foundry Cloud settings UI (BYOK Slice 5)#27

Open
0xharkirat wants to merge 5 commits into `main` from `feat/byok-slice5-foundry-ui`

Conversation

@0xharkirat
Contributor

Summary

Smallest possible Cloud Brain settings screen so users (not just devs with `--dart-define`) can plug in an Azure / Foundry deployment.

Two inputs. One save button. One clear button. That's it.

No provider dropdown, no mode toggle, no test-connection button, no cost meter — those land in later iterations once this round-trip is solid.

User flow

  1. Settings → Cloud brain (beta) row → Cloud Brain screen
  2. Paste the full endpoint URL from Azure portal (Keys & Endpoint tab) — the parser handles the verbatim shape:
    https://hark-ai-resource.cognitiveservices.azure.com/openai/deployments/hark-cloud-gpt-4-mini/chat/completions?api-version=2025-01-01-preview
    
  3. Paste the API key
  4. Tap Save — routing automatically flips to `cloudPreferred`
  5. Next voice command takes the cloud path

What's new

  • `AzureUrlParser` — parses verbatim Azure URL into `AzureConfig` fields:
    • `baseUrl` — `{scheme}://{host}/openai/deployments/{name}`
    • `model` — deployment name from the path
    • `apiVersion` — the `api-version` query parameter
    Supports the classic `openai.azure.com` and `cognitiveservices.azure.com` domains as well as the Foundry `services.ai.azure.com` domain. Accepts URLs with or without the trailing `/chat/completions`. Throws a `FormatException` with a user-friendly message on bad input. 13 unit tests cover the happy and error paths.
  • `CloudBrainScreen` — two text fields, save/clear buttons, inline error + status banners, privacy note. Pre-populates the URL field from existing config but never the API key (paste-to-replace).
  • `/settings/cloud-brain` route added to `hark_router`.
  • "Cloud brain (beta)" section + row in the main settings screen with an On/Off pill and `deployment name · mode` as subtitle when configured.
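For illustration, the parsing the `AzureUrlParser` bullet describes can be sketched with Dart's built-in `Uri`. This is a hypothetical sketch, not the PR's implementation — the `parseAzureUrl` function name is invented, and only the `baseUrl` / `model` / `apiVersion` fields of `AzureConfig` are taken from the PR description:

```dart
// Minimal sketch of the described parsing logic (assumed names, not the
// actual AzureUrlParser from this PR).
class AzureConfig {
  final String baseUrl;
  final String model;
  final String apiVersion;
  AzureConfig({required this.baseUrl, required this.model, required this.apiVersion});
}

AzureConfig parseAzureUrl(String input) {
  final uri = Uri.tryParse(input.trim());
  if (uri == null || !uri.isScheme('https')) {
    throw const FormatException('Paste the full https:// endpoint URL from the Azure portal.');
  }
  const knownDomains = ['openai.azure.com', 'cognitiveservices.azure.com', 'services.ai.azure.com'];
  if (!knownDomains.any((d) => uri.host.endsWith(d))) {
    throw FormatException('Unrecognized Azure host: ${uri.host}');
  }
  // Expect /openai/deployments/{name}, optionally followed by /chat/completions.
  final segments = uri.pathSegments.where((s) => s.isNotEmpty).toList();
  final depIndex = segments.indexOf('deployments');
  if (depIndex < 1 || segments[depIndex - 1] != 'openai' || depIndex + 1 >= segments.length) {
    throw const FormatException('URL must contain /openai/deployments/{deployment-name}.');
  }
  final deployment = segments[depIndex + 1];
  final apiVersion = uri.queryParameters['api-version'];
  if (apiVersion == null || apiVersion.isEmpty) {
    throw const FormatException('URL is missing the ?api-version=... query parameter.');
  }
  return AzureConfig(
    baseUrl: '${uri.scheme}://${uri.host}/openai/deployments/$deployment',
    model: deployment,
    apiVersion: apiVersion,
  );
}
```

Fed the example URL from the user flow above, a parser like this yields `model = hark-cloud-gpt-4-mini` and `apiVersion = 2025-01-01-preview`.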

Test plan

  • `flutter analyze lib test` clean
  • `flutter test test/services/cloud/` 68/68 passing (13 new parser tests)
  • On device:
    • Settings → Cloud brain → paste URL + key → Save → success banner
    • Run a param-carrying command → `adb logcat | grep HarkCloudReq` shows the request, `HarkMetrics` shows `backend=openai_compat`
    • Restart app → row shows On + deployment name without re-entering anything
    • Tap Clear → row goes back to Off → next command uses local Qwen3
    • Bad URL (e.g. with the `api-version` parameter dropped) → inline error banner, nothing saved

Files

| File | Change |
| --- | --- |
| `lib/services/cloud/azure_url_parser.dart` | new — verbatim Azure URL → `AzureConfig` parser |
| `test/services/cloud/azure_url_parser_test.dart` | new — 13 tests |
| `lib/screens/cloud_brain_screen.dart` | new — settings screen with two inputs + save/clear |
| `lib/router/hark_router.dart` | new `/settings/cloud-brain` route |
| `lib/screens/settings_screen.dart` | new "Cloud brain (beta)" section + `_CloudBrainRow` widget |

Next slice

Slice 6 — failure handling, 24h result cache, cost meter, in-app fallback toasts on `CloudUnavailableError`. Plus deferred Slice 4 telemetry: which commands actually used cloud vs local fallback in the last session.

🤖 Generated with Claude Code

0xharkirat and others added 5 commits April 14, 2026 15:51
Adds the smallest possible Cloud Brain screen: paste full Azure
endpoint URL + API key, save to secure storage, switch routing to
cloudPreferred. No provider dropdown, no mode toggle, no test-
connection button — those land in later iterations.

- New AzureUrlParser parses the verbatim Azure portal URL into
  AzureConfig fields (baseUrl, deployment name as model, apiVersion
  query param). Supports classic openai.azure.com,
  cognitiveservices.azure.com, and Foundry services.ai.azure.com
  domains. Accepts URLs with or without trailing /chat/completions.
  13 unit tests covering happy + error paths.
- New CloudBrainScreen with two FTextField inputs (URL multiline,
  API key obscured), Save button (parses + persists + flips mode),
  Clear button (wipes config + reverts mode), inline error banner
  on parse failure, status banner on success. Pre-populates URL
  field from existing config but never the API key.
- New /settings/cloud-brain route in hark_router.
- New "Cloud brain (beta)" section + row in the main settings screen
  with on/off pill and the configured deployment name as subtitle.

User flow:

  1. Settings → Cloud brain (beta) row → Cloud Brain screen
  2. Paste full URL from Azure portal Keys & Endpoint tab
  3. Paste API key
  4. Save → routing flips to cloudPreferred
  5. Next voice command takes the cloud path

The parser tolerates the URL shape Azure surfaces in the portal
(classic /openai/deployments/{name}/chat/completions?api-version=...)
without forcing the user to pre-trim or break it into pieces.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The 404 path threw a CloudHardError with just the openai_dart
exception message, which loses the actual Azure response body
(`error.message`, `error.code`, `error.type`). Add HarkCloudErr
debug log lines per failure type so first-pass cloud testing can
inspect what Azure actually said.

Also enrich the CloudHardError thrown on 404 with the deployment
name we sent so the user sees what we asked for vs what Azure
couldn't find.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Microsoft surfaces two Azure OpenAI templates in their docs:

- Classic Azure OpenAI Service: api-key: <key> header
- Foundry serverless endpoints: Authorization: Bearer <key> header

Same URL shape, different auth header. The Foundry path is what every
modern model deployment uses (gpt-4.1-mini and friends). Sending
api-key to a Bearer-expecting endpoint returns 404 with body
{error: {code: 404, message: "Resource not found"}} instead of a
clean 401, because the path lookup fails before auth runs.

User reproduced this end-to-end: the Azure portal generated docs for
their hark-cloud-gpt-4-mini deployment use Authorization: Bearer.

Switch the adapter to use ApiKeyProvider (Bearer) for Azure too. The
api-version query parameter is still wired correctly via OpenAIConfig.
The kind dispatch collapses to a single shared branch since all four
OpenAI-compat providers now use the same auth header — kind is still
useful for future per-provider tweaks (Anthropic adapter dispatch,
Gemini-specific quirks).

If a future user has a legacy classic Azure OpenAI deployment that
only accepts api-key, we'll add a UI toggle then. For now the
Foundry-default behavior matches reality.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
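The two templates the commit contrasts differ only in one header. A hypothetical helper (not code from this PR — the `Content-Type` entry and the parameter name are assumptions) shows them side by side:

```dart
// The two auth header shapes from Microsoft's docs, as a sketch.
Map<String, String> azureHeaders(String apiKey, {bool classicApiKeyHeader = false}) {
  return classicApiKeyHeader
      // Classic Azure OpenAI Service template: api-key header.
      ? {'api-key': apiKey, 'Content-Type': 'application/json'}
      // Foundry serverless template — what the adapter now always sends.
      : {'Authorization': 'Bearer $apiKey', 'Content-Type': 'application/json'};
}
```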
The provider-abstraction package was net-negative for our use case:

- We POST one well-known shape per call (chat/completions with tools
  + tool_choice) — no abstraction needed
- Its Azure auth defaults bit us in Slice 5 (AzureApiKeyProvider sent
  api-key header instead of Bearer; mismatched our Foundry serverless
  deployment which expects Authorization: Bearer)
- Even after switching to ApiKeyProvider it still 404'd, masking
  whatever the actual issue is behind a wire format we couldn't
  inspect
- 50KB dependency for what is now a 50-line http POST

Direct http gives us:
- Full visibility on the request bytes (matches the curl Microsoft
  surfaces in their portal verbatim)
- Single code path: same Authorization: Bearer header for OpenAI,
  Azure, Gemini, OpenRouter, custom — every modern OpenAI-compat
  endpoint accepts Bearer
- Tighter URL construction: append /chat/completions to baseUrl by
  string concat (Uri.resolve mishandles per-deployment URLs)
- Simpler error mapping: status code → CloudUnavailableError /
  CloudHardError without translating openai_dart exception types

Anthropic still throws CloudHardError here — Slice 7 gets the
dedicated AnthropicAdapter for the native tool_use shape.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
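The "50-line http POST" the commit describes might look roughly like the following. This is a hedged sketch using `dart:io` rather than the project's actual adapter: `CloudUnavailableError` / `CloudHardError` are named in the PR, but their shapes here, and the `buildChatUrl` / `mapStatus` / `chatCompletions` helpers, are assumptions for illustration:

```dart
import 'dart:convert';
import 'dart:io';

class CloudUnavailableError implements Exception {
  final String message;
  CloudUnavailableError(this.message);
}

class CloudHardError implements Exception {
  final String message;
  CloudHardError(this.message);
}

// String concat on purpose: Uri.resolve against a per-deployment base URL
// can drop the /openai/deployments/{name} path prefix.
Uri buildChatUrl(String baseUrl, String apiVersion) =>
    Uri.parse('$baseUrl/chat/completions?api-version=$apiVersion');

// Status code → error class, keeping the raw Azure body for inspection.
Exception mapStatus(int statusCode, String body) =>
    (statusCode == 429 || statusCode >= 500)
        ? CloudUnavailableError('HTTP $statusCode: $body')
        : CloudHardError('HTTP $statusCode: $body');

Future<Map<String, dynamic>> chatCompletions({
  required String baseUrl, // e.g. https://host/openai/deployments/name
  required String apiKey,
  required String apiVersion,
  required Map<String, dynamic> body,
}) async {
  final client = HttpClient();
  try {
    final request = await client.postUrl(buildChatUrl(baseUrl, apiVersion));
    // Same Bearer header for every OpenAI-compat provider.
    request.headers.set(HttpHeaders.authorizationHeader, 'Bearer $apiKey');
    request.headers.contentType = ContentType.json;
    request.write(jsonEncode(body));
    final response = await request.close();
    final text = await response.transform(utf8.decoder).join();
    if (response.statusCode == 200) {
      return jsonDecode(text) as Map<String, dynamic>;
    }
    throw mapStatus(response.statusCode, text);
  } finally {
    client.close();
  }
}
```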
Race condition surfaced in device testing: cloudProviderNotifier.build()
returns a default empty state immediately and kicks off _loadFromStorage()
fire-and-forget. If the first voice command fires before that load
completes — which is the cold-start path when the user goes straight to
a command without opening Settings first — cloudSlotFillerProvider sees
config=null and the resolver routes to local Qwen3.

Earlier "working" runs only worked because the user had opened the
Cloud Brain settings screen first, which warmed the secure-storage
load.

Fix: await CloudProviderNotifier.awaitInitialLoad() at the top of the
resolver's slotFill closure. Idempotent + cached, so the cost is one
secure-storage read on the very first command of a session and zero
thereafter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
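Stripped of the Riverpod machinery, the fix described above is a memoized `Future`. A hypothetical, framework-free sketch (the real `CloudProviderNotifier` is not shown in this PR; `loadCount` is instrumentation added for the sketch only):

```dart
class CloudProviderNotifier {
  int loadCount = 0; // instrumentation for this sketch only
  Future<void>? _initialLoad;

  // First caller kicks off the secure-storage read; every later caller —
  // including concurrent ones — awaits the same cached Future, so the read
  // happens at most once per session.
  Future<void> awaitInitialLoad() => _initialLoad ??= _loadFromStorage();

  Future<void> _loadFromStorage() async {
    loadCount++;
    // Stand-in for the real flutter_secure_storage read.
    await Future<void>.delayed(const Duration(milliseconds: 5));
  }
}
```

With this shape, awaiting `awaitInitialLoad()` at the top of the resolver's slotFill closure costs one storage read on the first command and nothing thereafter.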